Bayes risk decoding and its application to system combination

Author

  • Björn Hoffmeister
Abstract

Speech recognition is the task of converting an acoustic signal containing speech into written text. The error of a speech recognition system is measured as the number of words in which the recognized and the spoken text differ. This work investigates and develops decoding and system combination approaches within the Bayes risk decoding framework, with the objective of reducing the number of word errors.

The investigated approaches are computationally too expensive to be applied directly in the speech decoder. Instead, the result of a first recognition pass is used, which narrows down the set of hypotheses and provides the result in a compact form, the word lattice. In the single-system decoding task a single word lattice is given, and in the lattice-based system combination task a word lattice is provided by each system. In both cases the goal is to minimize the number of word errors in the final hypothesis. In large vocabulary continuous speech recognition (LVCSR) tasks the number of word errors is computed as the Levenshtein distance between the recognized and the spoken text.

The Bayes risk decoding framework yields the hypothesis with the least expected number of errors w.r.t. a specified loss function, given the true sentence posterior probabilities. However, the true probabilities are not known, and computing the Bayes risk hypothesis with the Levenshtein distance as loss function is computationally infeasible for a word lattice. Consequently, in lattice-based Bayes risk decoding and system combination two problems have to be addressed: first, how to estimate the sentence posterior probabilities given one or several word lattices; second, how to approximate the Levenshtein distance such that the computation of the Bayes risk hypothesis becomes feasible.
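The Bayes risk decision rule can be illustrated on an N-best list, where the expected loss is computable exactly. The sketch below is purely illustrative (the words and posteriors are invented, and real decoders operate on word lattices rather than short N-best lists): it selects the hypothesis with the least expected Levenshtein distance under the given posterior distribution.

```python
def levenshtein(a, b):
    # Word-level edit distance via the standard dynamic program.
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution/match
    return d[m][n]

def mbr_decode(hypotheses):
    # hypotheses: list of (word_tuple, posterior) pairs, posteriors summing to 1.
    # Returns the hypothesis with the least expected Levenshtein loss.
    best, best_risk = None, float("inf")
    for w, _ in hypotheses:
        risk = sum(p * levenshtein(w, v) for v, p in hypotheses)
        if risk < best_risk:
            best, best_risk = w, risk
    return best

nbest = [
    (("the", "cat", "sat"), 0.4),
    (("a", "cat", "sat"), 0.3),
    (("a", "cat", "sad"), 0.3),
]
print(mbr_decode(nbest))  # -> ('a', 'cat', 'sat')
```

Note that the Bayes risk hypothesis ("a cat sat", expected loss 0.7) differs from the maximum a posteriori hypothesis ("the cat sat", posterior 0.4), because the loss function pools the probability mass of similar hypotheses.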
Based on the separation of the posterior probability computation and the loss function in the Bayes risk decoding rule, a framework will be developed which covers the common approaches to lattice-based system combination, such as ROVER, CNC, and DMC. Furthermore, it will be shown that the common approximations of the Levenshtein distance used in LVCSR tasks can be classified into two categories for which efficient Bayes risk decoders exist. The existing approximations will be investigated and compared, and new loss functions will be developed which overcome drawbacks of the existing approximations to the Levenshtein distance, such as the frequently observed deletion bias.

A data structure of particular interest is the confusion network (CN). In previous work it was shown that a CN has a simple decoding rule in the Bayes risk framework. In this work new algorithms for deriving a CN from a word lattice will be developed and compared to existing methods. Furthermore, the CN will be the basis for several investigations aiming at improving the posterior probability estimates and the approximation of the Levenshtein distance. The methods investigated include classifier-based system combination and the usage of a windowed Levenshtein distance as loss function for the Bayes risk decoder. A further topic of research is log-linear model combination, for which the enhancement with model- and word-dependent scaling factors will be investigated.

The methods are tested on the Chinese speech recognition systems used by RWTH Aachen in the GALE project and on the lattices provided within the English track of the 2007 TC-Star EPPS evaluation. The best-performing system combination methods investigated in this work improve the error rates by up to 10% relative for intra-site combination experiments, and by more than 20% relative for cross-site combinations, compared to the best single system.
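The simple CN decoding rule mentioned above can be sketched as follows: a confusion network is a linear sequence of slots, each holding a posterior distribution over competing words (plus an epsilon entry for "no word here"), and the Bayes risk hypothesis under the CN loss is the slot-wise argmax. The representation below is a toy sketch with invented words and posteriors, not the lattice-to-CN algorithms developed in the thesis.

```python
EPS = "*eps*"  # epsilon entry: the slot contributes no word

def cn_decode(cn):
    # cn: list of slots; each slot maps a word (or EPS) to its posterior.
    # Under the CN loss, the Bayes risk hypothesis is the slot-wise argmax,
    # with epsilon winners dropped from the output.
    out = []
    for slot in cn:
        word = max(slot, key=slot.get)
        if word != EPS:
            out.append(word)
    return out

cn = [
    {"the": 0.6, "a": 0.4},
    {"cat": 0.9, "cap": 0.1},
    {"sat": 0.5, "sad": 0.3, EPS: 0.2},
]
print(cn_decode(cn))  # -> ['the', 'cat', 'sat']
```

Because each slot is decided independently, decoding is linear in the number of slots; the hard part, which the thesis addresses, is deriving a good CN (the slot structure and posteriors) from a word lattice in the first place.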
The newly developed methods show a slight improvement over the existing approaches to lattice decoding and lattice-based system combination.

Similar references

Generalized Minimum Bayes Risk System Combination

Minimum Bayes Risk (MBR) has been used as a decision rule for both single-system decoding and system combination in machine translation. For system combination, we argue that common MBR implementations are actually not correct, since probabilities in the hypothesis space cannot be reliably estimated. These implementations achieve the effect of consensus decoding (which may be beneficial in its o...


Bayes risk approximations using time overlap with an application to system combination

The computation of the Minimum Bayes Risk (MBR) decoding rule for word lattices needs approximations. We investigate a class of approximations where the Levenshtein alignment is approximated under the condition that competing lattice arcs overlap in time. The approximations have their origins in MBR decoding and in discriminative training. We develop modified versions and propose a new, concept...
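The idea behind the time-overlap class of approximations can be sketched in a much simplified form: instead of a full Levenshtein alignment, a hypothesis word is only counted as correct if a competing arc with the same word identity overlaps it in time. The sketch below is a toy version over time-stamped word sequences (invented data; the approximations in the paper operate on lattice arcs with posterior weights, not on single sequences).

```python
def overlap(a, b):
    # Temporal overlap in seconds of two arcs given as (word, start, end).
    return max(0.0, min(a[2], b[2]) - max(a[1], b[1]))

def time_overlap_loss(hyp, ref):
    # Approximate word errors: a hypothesis arc counts as correct only if
    # some reference arc carrying the same word overlaps it in time.
    errors = 0
    for h in hyp:
        if not any(h[0] == r[0] and overlap(h, r) > 0.0 for r in ref):
            errors += 1
    return errors

hyp = [("the", 0.0, 0.3), ("cat", 0.3, 0.7), ("sad", 0.7, 1.0)]
ref = [("the", 0.0, 0.3), ("cat", 0.3, 0.7), ("sat", 0.7, 1.0)]
print(time_overlap_loss(hyp, ref))  # -> 1
```

The appeal of this family of losses is that the overlap condition is local: it can be evaluated arc by arc on a lattice without computing a global alignment, which is what makes the Bayes risk computation tractable.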


Minimum Bayes-risk System Combination

We present minimum Bayes-risk system combination, a method that integrates consensus decoding and system combination into a unified multi-system minimum Bayes-risk (MBR) technique. Unlike other MBR methods that re-rank translations of a single SMT system, MBR system combination uses the MBR decision rule and a linear combination of the component systems’ probability distributions to search for ...
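The distribution-mixing step of such a multi-system approach can be sketched as follows: the component systems' sentence posteriors are interpolated with system weights before any decision rule is applied. This is a deliberately minimal sketch (invented hypotheses, weights, and posteriors; the MBR search over the combined distribution is omitted for brevity).

```python
def combine_posteriors(system_posts, weights):
    # system_posts: one dict per system, mapping hypothesis -> posterior.
    # weights: interpolation weights for the systems, summing to 1.
    # Returns the linearly combined distribution over all hypotheses.
    combined = {}
    for post, lam in zip(system_posts, weights):
        for hyp, p in post.items():
            combined[hyp] = combined.get(hyp, 0.0) + lam * p
    return combined

sys_a = {"the cat sat": 0.7, "a cat sat": 0.3}
sys_b = {"a cat sat": 0.6, "the cat sad": 0.4}
mix = combine_posteriors([sys_a, sys_b], [0.5, 0.5])
print(max(mix, key=mix.get))  # -> 'a cat sat'
```

Note how the combination already has a consensus effect: "a cat sat" is the top hypothesis of neither system alone, but accumulates the most mass in the mixture (0.45), which an MBR decoder run on the combined distribution would then exploit further.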


Minimum Bayes Risk decoding and system combination based on a recursion for edit distance

In this paper we describe a method that can be used for Minimum Bayes Risk (MBR) decoding for speech recognition. Our algorithm can take as input either a single lattice, or multiple lattices for system combination. It has similar functionality to the widely used Consensus method, but has a clearer theoretical basis and appears to give better results both for MBR decoding and system combination...


Support Vector Machines for Segmental Minimum Bayes Risk Decoding of Continuous Speech

Segmental Minimum Bayes Risk (SMBR) Decoding involves the refinement of the search space into sequences of small sets of confusable words. We describe the application of Support Vector Machines (SVMs) as discriminative models for the refined search spaces. We show that SVMs, which in their basic formulation are binary classifiers of fixed dimensional observations, can be used for continuous spe...




Publication date: 2011